Predicting gene functions from multiple biological sources using novel ensemble methods

نویسندگان

  • Chandan K. Reddy
  • Mohammad S. Aziz
چکیده

The functional classification of genes plays a vital role in molecular biology. Detecting previously unknown role of genes and their products in physiological and pathological processes is an important and challenging problem. In this work, information from several biological sources such as comparative genome sequences, gene expression and protein interactions are combined to obtain robust results on predicting gene functions. The information in such heterogeneous sources is often incomplete and hence making the maximum use of all the available information is a challenging problem. We propose an algorithm that improves the performance of prediction of different models built on individual sources. We also develop a heterogeneous boosting framework that uses all the available information even if some sources do not provide any information about some of the genes. We demonstrate the superior performance of the proposed methods in terms of accuracy and F-measure compared to several imputation and integration schemes.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ensemble Positive Unlabeled Learning for Disease Gene Identification

An increasing number of genes have been experimentally confirmed in recent years as causative genes to various human diseases. The newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive unlabeled learning (PU learning) methods, which require only a positive training set P...

متن کامل

Identification of Alzheimer disease-relevant genes using a novel hybrid method

Identifying genes underlying complex diseases/traits that generally involve multiple etiological mechanisms and contributing genes is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, but identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...

متن کامل

AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density

MOTIVATION The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, the AUC-based ensemble methods are rather scant, largely due to the fact that the associated objective function is neither continuous nor concave. Indeed, there is no ...

متن کامل

Predicting gene expression from heterogeneous data

The complexity of gene expression and the elucidation of the mechanisms involved in its regulation constitute an extremely difficult challenge in modern bioinformatics despite the amount of information made recently available by high-throughput biotechnologies and genome-wide investigations. In this contribution we investigated the effectiveness of ensemble systems for gene expression predictio...

متن کامل

A committee machine approach for predicting permeability from well log data: a case study from a heterogeneous carbonate reservoir, Balal oil Field, Persian Gulf

Permeability prediction problem has been examined using several methods such as empirical formulas, regression analysis and intelligent systems especially neural networks and fuzzy logic. This study proposes an improved and novel model for predicting permeability from conventional well log data. The methodology is integration of empirical formulas, multiple regression and neuro-fuzzy in a commi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • International journal of data mining and bioinformatics

دوره 12 2  شماره 

صفحات  -

تاریخ انتشار 2015